Multiscale Geometric Methods for Estimating Intrinsic Dimension

Authors

  • Anna V. Little
  • Mauro Maggioni
  • Lorenzo Rosasco
Abstract

We present a novel approach for estimating the intrinsic dimension of certain point clouds: we assume that the points are sampled from a manifold M of dimension k, with k << D, and corrupted by D-dimensional noise. When M is linear, one may analyze this situation by PCA: with no noise one would obtain a rank k matrix, and noise may be treated as a perturbation of the covariance matrix. When M is a nonlinear manifold, global PCA may dramatically overestimate the intrinsic dimension. We discuss a multiscale version of PCA and how one can extract estimators for the intrinsic dimension that are highly robust to noise, and we derive some of their finite-sample-size properties.

Keywords: Dimension estimation, multiscale analysis, geometric measure theory, point cloud data
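The contrast the abstract draws between global PCA and PCA restricted to a small scale can be illustrated with a toy example. The sketch below is not the authors' estimator: the noisy circle, the ball radius, the noise level, and the "largest ratio gap" rule for counting significant singular values are all assumptions chosen for illustration, and the helper names `singular_values_in_ball` and `dimension_from_gap` are hypothetical.

```python
import numpy as np

def singular_values_in_ball(points, center, radius):
    """Singular values of the centered points lying within `radius` of `center`,
    scaled by 1/sqrt(n) so they are comparable across scales."""
    mask = np.linalg.norm(points - center, axis=1) <= radius
    ball = points[mask]
    centered = ball - ball.mean(axis=0)
    return np.linalg.svd(centered, compute_uv=False) / np.sqrt(len(ball))

def dimension_from_gap(singular_values):
    """Toy rule (an assumption, not the paper's): count the singular values
    that precede the largest ratio between consecutive values."""
    ratios = singular_values[:-1] / singular_values[1:]
    return int(np.argmax(ratios)) + 1

rng = np.random.default_rng(0)
n, D, sigma = 2000, 10, 0.01  # sample size, ambient dimension, noise level (assumed)

# A unit circle (intrinsic dimension k = 1) embedded in R^D, plus D-dimensional noise.
t = rng.uniform(0.0, 2.0 * np.pi, size=n)
points = np.zeros((n, D))
points[:, 0] = np.cos(t)
points[:, 1] = np.sin(t)
points += sigma * rng.standard_normal((n, D))

center = points[0]

# Global PCA: the ball contains every point, so curvature looks like a second dimension.
global_dim = dimension_from_gap(singular_values_in_ball(points, center, np.inf))

# Local PCA: inside a small ball the circle is nearly a straight segment.
local_dim = dimension_from_gap(singular_values_in_ball(points, center, 0.3))

print("global estimate:", global_dim)
print("local estimate: ", local_dim)
```

In this setup the global estimate should come out as 2 and the local one as 1, since the circle spans a plane globally but is close to a line segment at radius 0.3. A multiscale approach would examine the singular values over a range of radii rather than at a single hand-picked one; the fixed radius here is only a shortcut for the illustration.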

Similar resources

Multiscale Geometric Methods for Data Sets I: Multiscale SVD, Noise and Curvature

Large data sets are often modeled as being noisy samples from probability distributions μ in R^D, with D large. It has been noticed that oftentimes the support M of these probability distributions seems to be well-approximated by low-dimensional sets, perhaps even by manifolds. We shall consider sets that are locally well approximated by k-dimensional planes, with k ≪ D, with k-dimensional manifold...

Some recent advances in multiscale geometric analysis of point clouds

We discuss recent work based on multiscale geometric analysis for the study of large data sets that lie in high-dimensional spaces but have low-dimensional structure. We present three applications: the first one to the estimation of intrinsic dimension of sampled manifolds, the second one to the construction of multiscale dictionaries, called geometric wavelets, for the analysis of point clouds...

Fractal-Based Methods as a Technique for Estimating the Intrinsic Dimensionality of High-Dimensional Data: A Survey

The estimation of the intrinsic dimensionality of high-dimensional data remains a challenging issue. Various approaches to interpreting and estimating the intrinsic dimensionality have been developed. Referring to the following two classifications of estimators of the intrinsic dimensionality – local/global estimators and projection techniques/geometric approaches – we focus on the fractal-based methods ...

Multiscale Estimation of Intrinsic Dimensionality of Data Sets

We present a novel approach for estimating the intrinsic dimensionality of certain point clouds: we assume that the points are sampled from a manifold M of dimension k, with k << D, and corrupted by D-dimensional noise. When M is linear, one may analyze this situation by SVD: with no noise one would obtain a rank k matrix, and noise may be treated as a perturbation of the covariance matrix. Whe...

Multi-Resolution Geometric Analysis for Data in High Dimensions

Large data sets arise in a wide variety of applications and are often modeled as samples from a probability distribution in high-dimensional space. It is sometimes assumed that the support of such a probability distribution is well approximated by a set of low intrinsic dimension, perhaps even a low-dimensional smooth manifold. Samples are often corrupted by high-dimensional noise. We are interest...

Publication date: 2010